One of the most difficult parts of working with XML is getting content from its original format into XML format. A QuarkXPress document may be organized with style sheets and other conventions, but how do you translate that kind of organization into XML?
Avenue.quark helps to automate this process. Given a QuarkXPress document and a DTD, avenue.quark lets you create a "tagging rule set," which can automatically map combinations of QuarkXPress style sheets, colors, and type styles to element types in a DTD.
A tagging rule set lets you associate QuarkXPress style sheets, colors, and text styles with elements in a DTD. You can use a tagging rule set to automate part of the process of tagging a QuarkXPress document.
For information on how to use tagging rule sets in rule-based tagging, see Chapter 6, "Tagging Content."
What is a tagging rule set?
A tagging rule set lets you specify that when you use rule-based tagging, content that meets a specific set of criteria should be tagged with a particular element name. For example, you could set up a tagging rule indicating that each paragraph that uses the "Headline" paragraph style sheet should be tagged as a <headline> element.
A tagging rule set is a named set of tagging rules that are all based on a single DTD. Each tagging rule specifies which style sheets, colors, and text styles should be mapped to its corresponding element. For example, the tagging rule in the illustration below indicates that text that uses the "01 Title" style sheet should be tagged with the <title> element type:
Tagging rule sets let you control how rule-based tagging is applied.
You could add another rule to specify that italicized text in paragraphs that use the "01 Title" style sheet should be tagged with <emphasis> tags, like so:
Tagging rule sets let you nest elements within other elements.
Given the two tagging rules specified above, a paragraph that uses the "01 Title" paragraph style sheet and contains italic text might be tagged as follows:
<title>What the Maid <emphasis>Really </emphasis>Saw</title>
In order for the selected element type to be used, all of the criteria in the Rule Settings area must be met. For example, the following tagging rule indicates that only text that uses the "02 Author" paragraph style sheet and is red and is bold should be tagged with the <author> element type:
All tagging rule criteria must be met for a tag to be used.
If there is more than one kind of formatting you want mapped to a particular element type, you can simply create additional rules for that element type. For example, say you have two different paragraph style sheets for names; one style sheet for the first name in a list, and another style sheet for the other names in the list. (This is commonly done for spacing reasons.) You could simply create two tagging rules for the <name> element type, one that maps the "First Name" style sheet to <name> and one that maps the "Remaining Names" style sheet to <name>. Avenue.quark would then tag paragraphs that met either rule's criteria as <name> elements.
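The matching behavior described above can be sketched in Python. This is only an illustrative model, not avenue.quark's implementation: the TextRun and TaggingRule classes and the element_for function are hypothetical names, and the sketch assumes that a criterion left unset in a rule is simply not checked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextRun:
    """A run of text with uniform formatting (hypothetical model)."""
    text: str
    style_sheet: str
    color: str = "Black"
    type_styles: frozenset = frozenset()


@dataclass(frozen=True)
class TaggingRule:
    """Maps formatting criteria to an element type; None means 'not specified'."""
    element: str
    style_sheet: Optional[str] = None
    color: Optional[str] = None
    type_styles: Optional[frozenset] = None

    def matches(self, run: TextRun) -> bool:
        # Every criterion that the rule specifies must be met.
        if self.style_sheet is not None and run.style_sheet != self.style_sheet:
            return False
        if self.color is not None and run.color != self.color:
            return False
        if self.type_styles is not None and not self.type_styles <= run.type_styles:
            return False
        return True


def element_for(run, rules):
    """Return the element of the first matching rule, or None if no rule applies."""
    for rule in rules:
        if rule.matches(run):
            return rule.element
    return None


rules = [
    # Style sheet AND color AND type style must all match.
    TaggingRule("author", style_sheet="02 Author", color="Red",
                type_styles=frozenset({"bold"})),
    # Two rules for the same element type: either one is enough.
    TaggingRule("name", style_sheet="First Name"),
    TaggingRule("name", style_sheet="Remaining Names"),
]

bold_red = TextRun("A. Writer", "02 Author", "Red", frozenset({"bold"}))
plain_red = TextRun("A. Writer", "02 Author", "Red")
first = TextRun("Ann", "First Name")
other = TextRun("Bob", "Remaining Names")

for run in (bold_red, plain_red, first, other):
    print(run.style_sheet, "->", element_for(run, rules))
```

In this model, the red but non-bold "02 Author" text matches no rule, while both name style sheets resolve to the same element type.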
Who creates tagging rule sets? In many workflows, only administrative personnel should create tagging rule sets.
How rule-based text tagging works
When you use rule-based tagging on a box full of text, avenue.quark goes through that text from beginning to end and tries to tag the text to match the DTD. At any given point in this process, avenue.quark is looking ahead to see if it can find text that matches a rule that fits the DTD.
Text that cannot be tagged according to any tagging rule is ignored.
Tagging rule conflicts
Let's say you've created a tagging rule set containing two rules. The first rule says to tag text that uses "Body Text" as a <body> element. The second rule says to tag text that uses "Body Text" as a <paragraph> element. What happens if you use this tagging rule set on a box containing a paragraph of text that uses the "Body Text" style sheet?
In that case, avenue.quark displays a dialog box asking which element type you want to use. The Tagging Rule Conflict dialog box is displayed whenever two or more rules could apply to the same text.
What if you wanted avenue.quark to tag the same text twice, and put copies of the text into both a <body> element and a <paragraph> element? You could create two tagging rule sets (one that says to tag "Body Text" as a <body> element, and one that says to tag "Body Text" as a <paragraph> element) and then perform rule-based tagging on the same text twice, once with each tagging rule set.
Tagging Rule Conflict dialog box
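As a rough illustration of when a conflict arises, the following Python sketch collects every element type whose rule matches a style sheet and defers to a callback when more than one applies. The function names and the choose callback (standing in for the Tagging Rule Conflict dialog box) are assumptions made for this example and do not reflect how avenue.quark is implemented.

```python
def candidate_elements(style_sheet, rules):
    """Return each distinct element type whose rule matches, in rule order."""
    found = []
    for rule_style, element in rules:
        if rule_style == style_sheet and element not in found:
            found.append(element)
    return found


def resolve(style_sheet, rules, choose):
    """Pick an element type, asking `choose` only when rules conflict."""
    candidates = candidate_elements(style_sheet, rules)
    if not candidates:
        return None               # no rule applies; the text is not tagged
    if len(candidates) == 1:
        return candidates[0]      # exactly one rule applies; no conflict
    return choose(candidates)     # conflict: let the user decide


# Each rule is a (style sheet name, element type) pair.
rules = [
    ("Body Text", "body"),
    ("Body Text", "paragraph"),
    ("Headline", "headline"),
]

# Stand-in for the user's choice in the dialog box: take the last candidate.
pick_last = lambda candidates: candidates[-1]

print(candidate_elements("Body Text", rules))
print(resolve("Body Text", rules, pick_last))
print(resolve("Headline", rules, pick_last))
```

Splitting the two "Body Text" rules into separate rule sets, as suggested above, would leave a single candidate in each set, so no choice would be needed.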
Creating a tagging rule set
A tagging rule set lets you specify how text should be tagged when you use rule-based tagging. To create a tagging rule set:
1. Create or open the XML document for which you want to create a tagging rule set.
2. Create or open a QuarkXPress document that contains all of the style sheets and colors you want to use in the tagging rule set.
3. Choose Edit > Tagging Rules. The Tagging Rules dialog box is displayed.
Create a new tagging rule set from the Tagging Rules dialog box.
4. Click the New Set button to create a new tagging rule set. The Edit Tagging Rules dialog box is displayed, and the DOCTYPE's root element and file name are listed in the title bar.
The Edit Tagging Rules dialog box lets you create and edit a tagging rule set.
5. Enter a name for the tagging rule set in the Name field.
6. Select a bold element type in the list on the left. (If an element type's name is unavailable, that means the DTD does not allow it to be associated with rules.) To expand a container element and display all the elements it contains, click the expand icon next to that element (the icon differs between Mac OS and Windows). To view more of the DTD, scroll through the list.
7. To begin adding a new rule to the tagging rule set, click Add Rule. A blank rule is added to the Rules list.
8. To configure the tagging rule to automatically tag text that uses a particular style sheet, click Style Sheet and then choose a style sheet name from the Style Sheet pop-up menu. If you want a consecutive series of paragraphs that use the indicated paragraph style sheet to be inserted into separate elements, check New tag for each paragraph; if you want a consecutive series of paragraphs that use the indicated style sheet to be inserted into a single element, leave this box unchecked. Style sheets displayed in italics are not present in the active QuarkXPress document.
In order for the New tag for each paragraph option to work, the DTD must support multiple sequential occurrences of the selected element.
9. To configure the tagging rule to automatically tag text that uses a particular color, click Color and then choose a color name from the Color pop-up menu. Color names displayed in italics are not present in the active QuarkXPress document.
Tagging rule sets contain only the names of style sheets and colors. If you change the name of a style sheet or color in the document, you must update the tagging rule set as well.
10. To configure the tagging rule to automatically tag text that uses a particular combination of type styles, click Type Style and then click the icons to indicate which type styles should be tagged. A type style icon with a black background indicates that text must use this type style to be tagged; a type style icon with a white background indicates that text with this type style will not be tagged; and a type style icon with a gray background indicates that this type style will not be taken into account during rule-based tagging.
Remember that text is not tagged until you perform rule-based tagging on it. For more information about rule-based tagging, see Chapter 6, "Tagging Content."
11. To add a new rule for the selected element type, click Add Rule and then repeat Steps 8 through 10. To base a new rule on an existing rule, select the existing rule in the Rules list; click Duplicate to create a copy of that rule; and then reconfigure the duplicate rule.
12. To delete a rule for the selected element type, select the rule in the Rules list and then click Delete.
Element types for which rules have been created are displayed in italics in the DTD list.
13. To save your changes to the tagging rule set, click OK.
14. Click Save to close the Tagging Rules dialog box.
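The effect of the New tag for each paragraph option in Step 8 can be sketched as follows. This is a simplified Python model under stated assumptions: tag_paragraphs is a hypothetical function, paragraphs are represented as (style sheet, text) pairs, and joining grouped paragraphs with a space is a choice made for this example only; how avenue.quark separates paragraphs inside a single element is not specified here.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def tag_paragraphs(paragraphs, style_sheet, element, new_tag_for_each):
    """Tag paragraphs that use `style_sheet`; return a list of XML strings."""
    output = []
    # groupby yields one group per consecutive series with the same style sheet.
    for style, group in groupby(paragraphs, key=lambda p: p[0]):
        if style != style_sheet:
            continue  # paragraphs not covered by this rule are skipped
        texts = [escape(text) for _, text in group]
        if new_tag_for_each:
            # Option checked: one element per paragraph.
            for text in texts:
                output.append("<%s>%s</%s>" % (element, text, element))
        else:
            # Option unchecked: the whole series goes into a single element.
            joined = " ".join(texts)  # separator is an assumption
            output.append("<%s>%s</%s>" % (element, joined, element))
    return output


paragraphs = [
    ("Body Text", "One."),
    ("Body Text", "Two."),
    ("Headline", "Next"),
    ("Body Text", "Three."),
]

separate = tag_paragraphs(paragraphs, "Body Text", "paragraph", True)
combined = tag_paragraphs(paragraphs, "Body Text", "paragraph", False)
print(separate)
print(combined)
```

With the option checked, the model produces several sequential elements of the same type, which is why the DTD must allow multiple sequential occurrences of that element.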
If an element type occurs more than once in the DTD tree, creating a rule for one occurrence applies that rule to all occurrences.
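The three type style icon states described in Step 10 (black, white, and gray backgrounds) behave like a three-valued filter. The Python sketch below models that logic; the StyleState enum and the type_styles_match function are hypothetical names introduced for illustration, not part of avenue.quark.

```python
from enum import Enum


class StyleState(Enum):
    REQUIRED = "black"  # text must use this type style to be tagged
    EXCLUDED = "white"  # text with this type style is not tagged
    IGNORED = "gray"    # this type style is not taken into account


def type_styles_match(criteria, run_styles):
    """Return True if the text's type styles satisfy every icon state."""
    for style, state in criteria.items():
        present = style in run_styles
        if state is StyleState.REQUIRED and not present:
            return False
        if state is StyleState.EXCLUDED and present:
            return False
        # IGNORED: presence or absence makes no difference.
    return True


criteria = {
    "bold": StyleState.REQUIRED,
    "italic": StyleState.EXCLUDED,
    "underline": StyleState.IGNORED,
}

for styles in ({"bold"}, {"bold", "underline"}, {"bold", "italic"}, {"underline"}):
    print(sorted(styles), type_styles_match(criteria, styles))
```

Under these criteria, bold text matches whether or not it is underlined, bold italic text is rejected, and text that is not bold never matches.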
What if you want to create a tagging rule set that includes rules for style sheets from several different documents? Just create a new document, append all of the style sheets from their various documents (File > Append > Style Sheets tab), and then create your tagging rules.
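Because tagging rule sets store only the names of style sheets and colors, a rule can refer to a name that the active document does not contain (these are the names shown in italics in the pop-up menus). A check of that kind might look like the following Python sketch; the dictionary layout for rules and the names_missing_from_document function are assumptions made for this example.

```python
def names_missing_from_document(rules, style_sheets, colors):
    """Return (missing style sheet names, missing color names), each sorted."""
    missing_styles = set()
    missing_colors = set()
    for rule in rules:
        style = rule.get("style_sheet")
        color = rule.get("color")
        # A rule refers to formatting by name only, so a simple
        # membership test shows whether the document still has it.
        if style is not None and style not in style_sheets:
            missing_styles.add(style)
        if color is not None and color not in colors:
            missing_colors.add(color)
    return sorted(missing_styles), sorted(missing_colors)


rules = [
    {"element": "title", "style_sheet": "01 Title"},
    {"element": "author", "style_sheet": "02 Author", "color": "Red"},
    {"element": "name", "style_sheet": "First Name"},
]
document_style_sheets = {"01 Title", "02 Author"}
document_colors = {"Black", "White"}

missing = names_missing_from_document(rules, document_style_sheets, document_colors)
print(missing)
```

Appending the missing style sheets into one document, as described above, or renaming them consistently in both the document and the rule set, would make such a check come back empty.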
Editing, duplicating, and deleting tagging rule sets
The Tagging Rules dialog box (Edit menu) lets you edit, duplicate, and delete tagging rule sets. Simply select a tagging rule set in the list and click one of these buttons: